feat: add --upload-content-type flag and smart MIME inference for uploads#429

Merged
jpoehnelt merged 1 commit into googleworkspace:main from Huglo:feat/upload-content-type
Mar 13, 2026

Huglo (Contributor) commented Mar 12, 2026

Summary

  • Adds --upload-content-type flag for multipart uploads, allowing the media Content-Type to be set independently from the metadata mimeType
  • Infers media MIME type from file extension (e.g. .md → text/markdown) when the flag is omitted, so Drive import conversions work automatically
  • Falls back to metadata mimeType only for unrecognized extensions, preserving backward compatibility

Previously, uploading notes.md with "mimeType":"application/vnd.google-apps.document" would label the media bytes as a Google Doc. Now the media part correctly gets text/markdown, and Drive performs the Markdown → Google Docs conversion.

Test plan

Made with Cursor

…oads

The multipart upload media Content-Type is now resolved independently
from the metadata mimeType, enabling Drive import conversions (e.g.
Markdown → Google Docs) to work automatically.

Priority order for the media MIME type:
1. --upload-content-type flag (explicit override)
2. File extension inference (best guess for what the bytes are)
3. Metadata mimeType (backward-compat fallback)
4. application/octet-stream
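
The priority order above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names `resolve_upload_mime` and `mime_from_extension` come from the PR summary, but their signatures and the extension table here are assumptions.

```rust
use std::path::Path;

/// Hypothetical extension table; the real helper may cover a different set.
fn mime_from_extension(path: &Path) -> Option<&'static str> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "md" => Some("text/markdown"),
        "txt" => Some("text/plain"),
        "html" | "htm" => Some("text/html"),
        "csv" => Some("text/csv"),
        "json" => Some("application/json"),
        "pdf" => Some("application/pdf"),
        _ => None,
    }
}

/// Assumed signature: resolves the media Content-Type in the documented order.
fn resolve_upload_mime(
    explicit: Option<&str>,      // 1. --upload-content-type flag
    path: &Path,                 // 2. file extension inference
    metadata_mime: Option<&str>, // 3. metadata mimeType fallback
) -> String {
    explicit
        .or_else(|| mime_from_extension(path))
        .or(metadata_mime)
        .unwrap_or("application/octet-stream") // 4. last resort
        .to_string()
}
```

With this sketch, `notes.md` plus a metadata mimeType of `application/vnd.google-apps.document` resolves to `text/markdown`, while a file with an unrecognized extension still falls back to the metadata value.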

Previously the metadata mimeType was reused for the media part, which
meant uploading `notes.md` with mimeType set to
`application/vnd.google-apps.document` would incorrectly label the
bytes as a Google Doc instead of text/markdown.

Made-with: Cursor
@Huglo Huglo requested a review from jpoehnelt as a code owner March 12, 2026 13:19
changeset-bot bot commented Mar 12, 2026

🦋 Changeset detected

Latest commit: a4bfb58

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
|------|------|
| @googleworkspace/cli | Minor |

googleworkspace-bot added the "area: http" and "area: core" (Core CLI parsing, commands, error handling, utilities) labels Mar 12, 2026
google-cla bot commented Mar 12, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the file upload functionality by introducing a dedicated flag for specifying the media's content type and implementing intelligent MIME type inference. These changes streamline the process of uploading files to services like Google Drive, particularly for scenarios requiring automatic format conversions, by ensuring the correct content type is communicated for the actual file bytes, independent of the target metadata type.

Highlights

  • Explicit Content-Type Flag: Introduced a new --upload-content-type flag to explicitly set the media Content-Type for multipart uploads, allowing it to be independent of the metadata mimeType.
  • Smart MIME Inference: Implemented automatic inference of the media MIME type from the file extension (e.g., .md to text/markdown) when the --upload-content-type flag is not provided.
  • Improved Drive Conversions: Enabled automatic import conversions in Google Drive by correctly setting the media Content-Type based on file content, allowing files like Markdown to be converted to Google Docs during upload.
  • Backward Compatibility: Ensured backward compatibility by falling back to the metadata mimeType only when the file extension is unrecognized for MIME inference.
Changelog
  • .changeset/upload-content-type.md
    • Added a new changeset file documenting the --upload-content-type flag and smart MIME inference feature.
  • src/commands.rs
    • Added the --upload-content-type command-line argument to methods supporting media uploads.
  • src/executor.rs
    • Modified build_http_request to accept the new upload_content_type parameter.
    • Implemented resolve_upload_mime function to determine the media MIME type based on explicit flag, file extension, or metadata.
    • Added mime_from_extension helper function to infer MIME types from file extensions.
    • Updated build_multipart_body to use the resolved media MIME type.
    • Adjusted test calls to build_multipart_body and execute_method to accommodate the new parameter.
    • Added new unit tests for resolve_upload_mime covering various scenarios (explicit, extension, metadata fallback, unknown extension, no metadata).
  • src/helpers/calendar.rs
    • Updated execute_method call to include the new upload_content_type parameter (passed as None).
  • src/helpers/chat.rs
    • Updated execute_method call to include the new upload_content_type parameter (passed as None).
  • src/helpers/docs.rs
    • Updated execute_method call to include the new upload_content_type parameter (passed as None).
  • src/helpers/drive.rs
    • Updated execute_method call to include the new upload_content_type parameter (passed as None).
  • src/helpers/gmail/mod.rs
    • Updated execute_method call to include the new upload_content_type parameter (passed as None).
  • src/helpers/script.rs
    • Updated execute_method call to include the new upload_content_type parameter (passed as None).
  • src/helpers/sheets.rs
    • Updated execute_method calls to include the new upload_content_type parameter (passed as None).
  • src/main.rs
    • Parsed the new --upload-content-type command-line argument.
    • Passed the upload_content_type to the execute_method function.
    • Updated the usage printout to include the new flag.
Activity
  • Implemented the --upload-content-type flag and smart MIME inference.
  • Ran cargo test, with 556 tests passing, including 7 new tests for resolve_upload_mime.
  • Executed cargo clippy -- -D warnings, resulting in a clean output.
  • Verified no overlap with existing PRs, specifically noting #418 (fix: stream multipart uploads to avoid OOM on large files).
gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a valuable feature for file uploads by adding the --upload-content-type flag and intelligent MIME type inference. The implementation is clean and well-tested, and the separation of concerns in resolve_upload_mime is good. I have one suggestion to improve the robustness of the file extension detection logic to better handle common edge cases like dotfiles.
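
As a rough illustration of the dotfile edge case mentioned in the review, the standard library's `Path::extension` already treats a leading dot as part of the file name rather than as an extension separator. The helper name `normalized_extension` below is hypothetical and is not taken from the PR; it only shows one way extension detection could be made robust.

```rust
use std::path::Path;

/// Hypothetical helper: returns the lowercased extension, if any.
/// Relies on `Path::extension`, which looks only at the final path
/// component and returns `None` for names like ".gitignore".
fn normalized_extension(path: &str) -> Option<String> {
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.to_ascii_lowercase())
}
```

Under this approach a file named `.gitignore` or `README` yields no extension, `notes.MD` yields `md`, and a dot in a parent directory name (`backup.v2/README`) is not mistaken for an extension.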

@peterHoburg

Related to #380 ?

Huglo commented Mar 13, 2026

> Related to #380 ?

Absolutely! This was actually my initial reason for looking into the issue. When checking whether it had already been tackled, I admit I mostly looked through the list of MRs and branches to make sure I was not duplicating effort, and I missed the open issue.

I think my fix would solve the issue through the auto-detection suggested in #380 while keeping the CLI as a thin transport layer. But I am happy to be guided in another direction 🙏.

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 69.76744% with 39 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.44%. Comparing base (7e22a3d) to head (a4bfb58).
⚠️ Report is 29 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| src/commands.rs | 0.00% | 13 Missing ⚠️ |
| src/executor.rs | 89.10% | 11 Missing ⚠️ |
| src/main.rs | 0.00% | 7 Missing ⚠️ |
| src/helpers/sheets.rs | 0.00% | 2 Missing ⚠️ |
| src/helpers/calendar.rs | 0.00% | 1 Missing ⚠️ |
| src/helpers/chat.rs | 0.00% | 1 Missing ⚠️ |
| src/helpers/docs.rs | 0.00% | 1 Missing ⚠️ |
| src/helpers/drive.rs | 0.00% | 1 Missing ⚠️ |
| src/helpers/gmail/mod.rs | 0.00% | 1 Missing ⚠️ |
| src/helpers/script.rs | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #429      +/-   ##
==========================================
+ Coverage   64.40%   64.44%   +0.04%     
==========================================
  Files          38       38              
  Lines       15584    15698     +114     
==========================================
+ Hits        10037    10117      +80     
- Misses       5547     5581      +34     


jpoehnelt (Member) left a comment

Thanks!

@jpoehnelt jpoehnelt merged commit dc561e0 into googleworkspace:main Mar 13, 2026
34 of 37 checks passed
Huglo commented Mar 13, 2026

🙏
I can see some lint errors, @jpoehnelt. Are they on me (in which case I am surprised they were not flagged before merging), or are you taking care of them?

malob added a commit to malob/cli that referenced this pull request Mar 15, 2026
…lder

Replace custom MessageBuilder, RFC 2047 encoding, header sanitization,
and address encoding (including googleworkspace#482) with the mail-builder crate
(Stalwart Labs, 0 runtime deps). Each command builds a
mail_builder::MessageBuilder directly.

Introduce structured types throughout:
- Mailbox type (parsed display name + email) replaces raw string passing
- sanitize_control_chars strips ASCII control characters (CRLF, null,
  tab, etc.) at the parse boundary — defense-in-depth for mail-builder's
  structured header types, superseding sanitize_header_value,
  sanitize_component, and encode_address_header from googleworkspace#482
- OriginalMessage fields use Option<T> instead of empty-string sentinels
- parse_original_message returns Result with validation (threadId, From,
  Message-ID)
- Pre-parsed Config types (SendConfig, ForwardConfig, ReplyConfig) with
  Vec<Mailbox> — parse at the boundary, not downstream
- parse_forward_args and parse_send_args return Result with --to
  validation, consistent with parse_reply_args
- parse_optional_mailboxes helper normalizes Some(vec![]) to None for
  optional address fields (--cc, --bcc, --from)
- Envelope types borrow from Config + OriginalMessage with lifetimes
- Message IDs stored bare (no angle brackets), parsed once at boundary
- References stored as Vec<String> instead of space-separated string
- ThreadingHeaders bundles In-Reply-To + References with debug_assert
  for bare-ID convention
- Shared CLI arg builders (common_mail_args, common_reply_args)
  eliminate duplicated --cc/--bcc/--html/--dry-run definitions

Additional improvements:
- finalize_message returns Result instead of panicking via .expect()
- Mailbox::parse_list filters empty-email entries (trailing comma edge
  case)
- format_email_link percent-encodes mailto hrefs to prevent parameter
  injection
- Forward date handling: omits Date line when absent instead of showing
  empty "Date: "
- Dry-run auth: log skipped auth as diagnostic instead of silently
  discarding errors
- Restore --html tips in after_help strings (gmail_quote CSS, cid:
  image warnings, HTML fragment advice) lost in release PR googleworkspace#434
- Update execute_method call for upload_content_type parameter (googleworkspace#429)

Delete: MessageBuilder, encode_header_value, sanitize_header_value,
encode_address_header, sanitize_component, extract_email,
extract_display_name, split_mailbox_list, build_references.
jpoehnelt pushed a commit that referenced this pull request Mar 17, 2026
…lder

jpoehnelt added a commit that referenced this pull request Mar 17, 2026
…#526)

* refactor(gmail): replace hand-rolled email construction with mail-builder

Replace custom MessageBuilder, RFC 2047 encoding, header sanitization,
and address encoding (including #482) with the mail-builder crate
(Stalwart Labs, 0 runtime deps). Each command builds a
mail_builder::MessageBuilder directly.

Introduce structured types throughout:
- Mailbox type (parsed display name + email) replaces raw string passing
- sanitize_control_chars strips ASCII control characters (CRLF, null,
  tab, etc.) at the parse boundary — defense-in-depth for mail-builder's
  structured header types, superseding sanitize_header_value,
  sanitize_component, and encode_address_header from #482
- OriginalMessage fields use Option<T> instead of empty-string sentinels
- parse_original_message returns Result with validation (threadId, From,
  Message-ID)
- Pre-parsed Config types (SendConfig, ForwardConfig, ReplyConfig) with
  Vec<Mailbox> — parse at the boundary, not downstream
- parse_forward_args and parse_send_args return Result with --to
  validation, consistent with parse_reply_args
- parse_optional_mailboxes helper normalizes Some(vec![]) to None for
  optional address fields (--cc, --bcc, --from)
- Envelope types borrow from Config + OriginalMessage with lifetimes
- Message IDs stored bare (no angle brackets), parsed once at boundary
- References stored as Vec<String> instead of space-separated string
- ThreadingHeaders bundles In-Reply-To + References with debug_assert
  for bare-ID convention
- Shared CLI arg builders (common_mail_args, common_reply_args)
  eliminate duplicated --cc/--bcc/--html/--dry-run definitions

Additional improvements:
- finalize_message returns Result instead of panicking via .expect()
- Mailbox::parse_list filters empty-email entries (trailing comma edge
  case)
- format_email_link percent-encodes mailto hrefs to prevent parameter
  injection
- Forward date handling: omits Date line when absent instead of showing
  empty "Date: "
- Dry-run auth: log skipped auth as diagnostic instead of silently
  discarding errors
- Restore --html tips in after_help strings (gmail_quote CSS, cid:
  image warnings, HTML fragment advice) lost in release PR #434
- Update execute_method call for upload_content_type parameter (#429)

Delete: MessageBuilder, encode_header_value, sanitize_header_value,
encode_address_header, sanitize_component, extract_email,
extract_display_name, split_mailbox_list, build_references.

* feat(gmail): add --from flag to +send for send-as alias support

Consistent with +reply, +reply-all, and +forward which already support
--from. Uses the same parse_optional_mailboxes path and
apply_optional_headers plumbing.

* fix: quote display names with RFC 2822 special characters in +reply

When replying to emails from corporate senders with display names like
"Anderson, Rich (CORP)" <email@adp.com>, the +reply command fails with
"Invalid To header" (400) from the Gmail API.

The root cause: encode_address_header() strips quotes from the display
name via extract_display_name(), then reconstructs the address without
re-quoting. When the display name contains RFC 2822 special characters
(commas, parentheses), the unquoted form is ambiguous — commas split
it into multiple malformed mailboxes and parentheses are interpreted
as RFC 2822 comments.

Fix: re-quote the display name when it contains any RFC 2822 special
characters, using a single-pass character iterator that preserves
already-escaped sequences and escapes bare quotes/backslashes.

Fixes #512

* feat(gmail): add --attachment flag, +read helper, and mail-builder migration

Consolidates PRs #491, #513, #517, and #502 into a single rollup:

- Migrate message construction to mail-builder crate (RFC-compliant MIME)
- Add --from flag to +send for send-as alias support
- Add --attachment flag to +send with MIME auto-detection and path validation
- Add +read helper for extracting message body/headers (text, HTML, JSON)
- Serialize support for OriginalMessage and Mailbox types
- Display name quoting handled natively by mail-builder

* chore: regenerate skills [skip ci]

* fix: use validate_safe_file_path for attachment path validation

Addresses Gemini review: validate_safe_dir_path hardcodes '--dir' in
error messages. validate_safe_file_path accepts the flag name, so errors
now correctly reference '--attachment'.

* refactor: make OriginalMessage.thread_id optional

The Gmail API does not guarantee threadId on all message resources
(e.g. drafts). Making it Option<String> prevents parse failures on
valid messages and avoids requiring thread_id in helpers like +read
that don't use it.

* fix: use canonicalized path for attachment file operations (TOCTOU)

validate_safe_file_path returns a canonicalized PathBuf. Use it for
exists/is_file checks and downstream file reads instead of the original
un-resolved path to prevent time-of-check/time-of-use races.

* feat(gmail): add --attach flag for file attachments

Add -a/--attach to +send, +reply, +reply-all, and +forward. Can be
specified multiple times for multiple attachments. MIME type is auto-
detected via mime_guess2. Closes #247.

Send via the Gmail API upload endpoint (multipart/related with
message/rfc822 media type) instead of base64-encoding into a JSON raw
field. This raises the size limit from ~5MB (metadata-only endpoint) to
35MB (upload endpoint, per discovery document).

Introduce UploadSource enum in the executor to consolidate upload_path,
upload_content_type, and upload_bytes into a single type-safe parameter.
File and Bytes variants make the two upload strategies (from disk vs.
from memory) mutually exclusive by construction.
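
A minimal sketch of what such an enum could look like is shown below. Only the name `UploadSource` and its `File` and `Bytes` variants come from the commit message; the field names, types, and helper methods are assumptions made for illustration.

```rust
use std::path::PathBuf;

/// Assumed shape: exactly one upload strategy can be represented at a time.
#[derive(Debug)]
enum UploadSource {
    /// Read the media from disk; the content type may be given or inferred later.
    File { path: PathBuf, content_type: Option<String> },
    /// Upload bytes already held in memory (for example a built MIME message).
    Bytes { data: Vec<u8>, content_type: String },
}

impl UploadSource {
    /// Hypothetical accessor for whatever content type was supplied up front.
    fn explicit_content_type(&self) -> Option<&str> {
        match self {
            UploadSource::File { content_type, .. } => content_type.as_deref(),
            UploadSource::Bytes { content_type, .. } => Some(content_type.as_str()),
        }
    }

    /// Hypothetical description used only to show exhaustive matching.
    fn describe(&self) -> String {
        match self {
            UploadSource::File { path, .. } => format!("file {}", path.display()),
            UploadSource::Bytes { data, .. } => format!("{} in-memory bytes", data.len()),
        }
    }
}
```

Because a value is either `File` or `Bytes`, a caller cannot pass both a path and a byte buffer, which is the "mutually exclusive by construction" property described above.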

Validates attachment paths (control characters, regular file, non-empty)
and total size (25MB raw limit, accounting for base64 expansion of
attachments within the MIME message against the 35MB API limit). Size
check uses actual bytes read to avoid TOCTOU race.
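
The relationship between the 25MB raw limit and the 35MB API limit can be checked with a little arithmetic. The sketch below assumes binary megabytes (MiB), standard base64 (4 output characters per 3 input bytes), and MIME line wrapping at 76 characters plus CRLF; the constant and function names are hypothetical and not taken from the commit.

```rust
const MIB: u64 = 1024 * 1024;
const RAW_ATTACHMENT_LIMIT: u64 = 25 * MIB; // limit on raw attachment bytes
const API_UPLOAD_LIMIT: u64 = 35 * MIB; // upload endpoint limit

/// Base64 output length: every 3 input bytes (rounded up) become 4 characters.
fn base64_encoded_len(raw: u64) -> u64 {
    (raw + 2) / 3 * 4
}

/// Encoded length including a CRLF after every 76-character line.
fn mime_wrapped_len(raw: u64) -> u64 {
    let encoded = base64_encoded_len(raw);
    let lines = (encoded + 75) / 76;
    encoded + 2 * lines
}

/// Rejects attachments whose combined raw size exceeds the raw limit.
fn check_total_size(total_raw: u64) -> Result<(), String> {
    if total_raw > RAW_ATTACHMENT_LIMIT {
        Err(format!(
            "attachments total {} bytes, limit is {} bytes",
            total_raw, RAW_ATTACHMENT_LIMIT
        ))
    } else {
        Ok(())
    }
}
```

Under these assumptions, 25 MiB of raw data encodes to roughly 33.3 MiB of base64 (about 34.2 MiB with line breaks), which leaves some headroom under 35 MiB for headers and the message body.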

* chore: update changeset and fix integration with malob's attachment impl

Update changeset to reflect combined work. Fix thread_id type mismatches
in new tests from cherry-pick. Fix upload_path scope in main.rs. Make
reject_control_chars pub(crate) for attachment validation.

Co-authored-by: Malo Bourgon <mbourgon@gmail.com>

* chore: regenerate skills [skip ci]

* fix: restore MIME sanitization and terminal escape protection in executor

Restore two security features accidentally lost during the UploadSource
refactor:

1. resolve_upload_mime: restructure from early-returns to collect-then-
   sanitize pattern — strips control chars from user-supplied MIME types
   to prevent CRLF header injection.

2. Model Armor error path: restore sanitize_for_terminal on error messages
   to prevent terminal escape sequence injection from API responses.
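
The collect-then-sanitize pattern from item 1 might look something like the sketch below. The function name and signature are hypothetical; the point is only that every candidate flows through a single sanitizing step instead of returning early, so no code path can skip the control-character stripping.

```rust
/// Hypothetical sketch: pick the first available candidate, then sanitize once.
fn resolve_and_sanitize_mime(
    explicit: Option<&str>,
    inferred: Option<&str>,
    metadata_mime: Option<&str>,
) -> String {
    // Collect: choose a candidate without returning early.
    let chosen = explicit
        .or(inferred)
        .or(metadata_mime)
        .unwrap_or("application/octet-stream");

    // Sanitize: drop control characters (CR, LF, NUL, ...) so the value
    // cannot terminate the Content-Type header and inject another one.
    chosen.chars().filter(|c| !c.is_control()).collect()
}
```

For example, a user-supplied value of `"text/plain\r\nX-Evil: 1"` would collapse to a single header-safe token with the CR and LF removed.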

Co-authored-by: Malo Bourgon <mbourgon@gmail.com>

* chore: remove duplicate changeset from cherry-pick

gmail-attach-flag.md duplicated content already in gmail-helpers-rollup.md.
Both were marked minor, which would cause a double version bump.

* fix: add path traversal protection to attachment validation

Replace reject_control_chars with validate_safe_file_path in
parse_attachments. All file operations (metadata, read, filename
extraction, MIME detection) now use the canonicalized path, preventing
path traversal attacks (e.g. ../../.ssh/id_rsa) and closing TOCTOU gaps.

Update tests to use CWD-relative temp directories (tempdir_in("."))
since validate_safe_file_path rejects paths outside the working directory.

Co-authored-by: Malo Bourgon <mbourgon@gmail.com>

* refactor: deduplicate terminal sanitizer in read.rs

Replace the local sanitize_terminal_output function with the existing
crate::error::sanitize_for_terminal via import alias. This eliminates
code duplication and provides consistent sanitization across the codebase.

The crate-wide sanitizer also correctly strips CR (carriage return) which
can be abused for terminal overwrite attacks.

---------

Co-authored-by: Malo Bourgon <mbourgon@gmail.com>
Co-authored-by: Rich Anderson <richanderson00@gmail.com>
Co-authored-by: jpoehnelt-bot <jpoehnelt-bot@users.noreply.github.com>
Co-authored-by: googleworkspace-bot <googleworkspace-bot@users.noreply.github.com>

Labels

area: core Core CLI parsing, commands, error handling, utilities area: http


4 participants